The Permanent Magnet Synchronous Motor (PMSM) is used in many applications: electric vehicles, wind turbines, motorcycles. The main advantage of this motor is its small size and low weight.
For reliable operation of the PMSM, the rotor/magnet temperature must be monitored by the control unit to avoid overheating of the permanent magnets.
Several solutions are possible:
One of the newer solutions is to use machine learning (ML) or deep learning to estimate/predict the rotor temperature. This can be a low-cost solution, since it removes the temperature sensor along with its power supply, communication, and filtering hardware, and it can outperform conventional estimators based on physical equations, whose main drawback is that they require an exact model of the motor to be accurate.
In the project below, we compare several regression ML models for predicting the rotor temperature of the PMSM.
We will use a dataset from Kaggle.
For more information about this dataset, please see the link.
Below is the comparison of the 5 models.
FYI: this file is a merge of 8 notebooks, so you will find redundancy in the library imports and in other parts like "Load data". To see the individual notebooks, please check the "notebooks" folder.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.pyplot import figure
import seaborn as sns
from utils import *
You can download the dataset from Kaggle via the link below:
https://www.kaggle.com/datasets/wkirgsn/electric-motor-temperature/data
Dataset Features
All recordings are sampled at 2 Hz. The data set consists of multiple measurement sessions, which can be distinguished from each other by column "profile_id". A measurement session can be between one and six hours long.
df=pd.read_csv(r'../data/raw/measures_v2.csv.zip')
df.head(2)
| | u_q | coolant | stator_winding | u_d | stator_tooth | motor_speed | i_d | i_q | pm | stator_yoke | ambient | torque | profile_id |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | -0.450682 | 18.805172 | 19.08667 | -0.350055 | 18.293219 | 0.002866 | 0.004419 | 0.000328 | 24.554214 | 18.316547 | 19.850691 | 0.187101 | 17 |
| 1 | -0.325737 | 18.818571 | 19.09239 | -0.305803 | 18.294807 | 0.000257 | 0.000606 | -0.000785 | 24.538078 | 18.314955 | 19.850672 | 0.245417 | 17 |
df.info(memory_usage='deep')
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1330816 entries, 0 to 1330815
Data columns (total 13 columns):
 #   Column          Non-Null Count    Dtype
---  ------          --------------    -----
 0   u_q             1330816 non-null  float64
 1   coolant         1330816 non-null  float64
 2   stator_winding  1330816 non-null  float64
 3   u_d             1330816 non-null  float64
 4   stator_tooth    1330816 non-null  float64
 5   motor_speed     1330816 non-null  float64
 6   i_d             1330816 non-null  float64
 7   i_q             1330816 non-null  float64
 8   pm              1330816 non-null  float64
 9   stator_yoke     1330816 non-null  float64
 10  ambient         1330816 non-null  float64
 11  torque          1330816 non-null  float64
 12  profile_id      1330816 non-null  int64
dtypes: float64(12), int64(1)
memory usage: 132.0 MB
Number of sessions
len(df.profile_id.unique())
69
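Since the recordings are sampled at 2 Hz, a session's duration follows directly from its row count per `profile_id`. A minimal sketch on a toy frame (`df_demo` is an illustrative stand-in, not the real data):

```python
import pandas as pd

# Toy stand-in for the real dataframe: two sessions of different lengths
df_demo = pd.DataFrame({"profile_id": [17] * 7200 + [18] * 14400})

# At 2 Hz, duration in hours = rows / (2 samples/s * 3600 s/h)
hours = df_demo.groupby("profile_id").size() / (2 * 3600)
print(hours)  # session 17 -> 1.0 h, session 18 -> 2.0 h
```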
df2=df.copy()
#voltage magnitude
df2['us']=np.sqrt(df2.u_d**2+df2.u_q**2)
# current magnitude
df2['is']=np.sqrt(df2.i_d**2+df2.i_q**2)
# Electric apparent power
df2['Se']=1.5*df2.us*df2['is']
# Mecanical power
df2['Pm']=df2.torque*(2*np.pi*df2.motor_speed/60)
Mechanical power (Pm) vs apparent electrical power (Se)
# Plot only Pm > 0 because Se > 0 by definition
df2[df2.Pm>0].plot(x='Se',y='Pm',kind='scatter',s=1,figsize=(5,3),label='data')
plt.plot([0,50000],[0,50000],c='r',label='Pm=Se')
plt.title('Mechanical power (Pm) vs apparent electrical power (Se)')
plt.legend()
plt.grid()
plt.show()
Torque (N.m) vs motor speed (rpm)
df2.plot.scatter(x= 'motor_speed',y='torque',c='Pm',figsize=(5,3))
plt.title('Torque (N.m) vs Motor speed (rpm)')
plt.grid()
plt.show()
Mechanical power (Pm, W) vs motor speed (rpm)
df2.plot.scatter(x= 'motor_speed',y='Pm',c='torque',figsize=(5,3))
plt.title('Mechanical power (Pm, W) vs Motor speed (rpm)')
plt.grid()
plt.show()
The correlation matrix
corr=df2.corr()
plt.figure(figsize=(5,5))
sns.heatmap(data=corr,square=True,cmap='bwr')
plt.yticks(rotation=0)
plt.xticks(rotation=90)
plt.title('Data correlation')
plt.show()
out='pm'
s=corr[out].drop(out)
index_sort=s.abs().sort_values(ascending=False).index
s[index_sort].plot.bar(figsize=(5,3))
plt.title('Correlation between the output ('+out+') and the inputs')
plt.xlabel('inputs')
plt.ylabel('Correlation')
plt.grid()
plt.show()
Histogram of the output
out='pm'
hist=df2[out]
wind=2
# wind is the bin width (delta between bins)
hist=(hist//wind)*wind
# We can approximate the histogram with a value count of the floored samples;
# alternatively we can use np.histogram or a similar function: see the next plot
hist=hist.value_counts().sort_index()
hist.plot.area(figsize=(5,3))
plt.title('Histogram of '+out)
plt.xlabel(out + ' range, delta_bins='+str(wind))
plt.ylabel('Count')
plt.grid()
plt.show()
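As the comment above notes, the same histogram can be obtained with `np.histogram` by passing bin edges spaced by `wind`; a small sketch on synthetic data (the normal samples stand in for `df2['pm']`):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(50, 10, 1000)  # synthetic stand-in for df2['pm']

wind = 2  # bin width, as in the plot above
edges = np.arange(x.min(), x.max() + wind, wind)
counts, edges = np.histogram(x, bins=edges)
# counts[i] is the number of samples falling in [edges[i], edges[i+1])
print(counts.sum())  # 1000: every sample lands in exactly one bin
```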
Density PDF and CDF of the output
out='pm'
# Color plot
r='#b7190f'
b='#26619c'
# PDF and CDF calculation
from scipy import integrate
pdf, bins = np.histogram(df2[out], bins=60, density=True)
bins = 0.5 * (bins[:-1] + bins[1:])
cdf=integrate.cumtrapz(pdf, bins, initial=0)  # cumulative_trapezoid in newer SciPy
# Create a new figure with two y-axes
fig, ax1 = plt.subplots(figsize=(5,3))
ax2 = ax1.twinx()
# ax1 PDF
ax1.plot(bins, pdf, label='pdf', marker='.',c=r)
ax1.set_xlabel(out + ' range, bins=60')
ax1.set_ylabel('Pdf of '+out,c=r)
ax1.grid(which='major', color=r, linewidth=1,alpha=0.2)
ax1.tick_params(axis='y', colors=r)
# ax2 CDF
ax2.plot(bins, cdf, label='Cdf', marker='.',c=b)
ax2.set_ylabel('Cdf of '+out,c=b)
ax2.grid(which='major', color=b, linewidth=1,alpha=0.2)
ax2.tick_params(axis='y', colors=b)
plt.suptitle('Pdf and Cdf of ' + out)
fig.legend(loc='upper left')
plt.show()
# Color plot
r='#b7190f'
b='#26619c'
# Create a new figure with two y-axes
fig, ax1 = plt.subplots(figsize=(18, 3))
ax2 = ax1.twinx()
# ax1 Pm
ax1.plot(df2.Pm,label='Pm',c=b)
ax1.set_xlabel('data rows')
ax1.set_ylabel('Pm (W)',c=b)
ax1.grid(which='major', color=b, linewidth=1,alpha=0.2)
ax1.tick_params(axis='y', colors=b)
# ax2 sessions
ax2.plot(df2.profile_id,label='profile_id',c=r)
ax2.set_ylabel('Profile_id (session)',c=r)
ax2.grid(which='major', color=r, linewidth=1,alpha=0.2)
ax2.tick_params(axis='y', colors=r)
plt.suptitle('Mechanical power vs session Id')
plt.legend()
plt.grid()
In general, we use the function sklearn.model_selection.train_test_split to create the train/test split, but in our case we can't use it: rows belonging to the same measurement session (profile_id) are strongly correlated, so a random row-wise split would leak session data between the train and test sets. The split must therefore keep whole sessions together.
The session ids
np.sort(df2.profile_id.iloc[int(6e5):].unique())
array([16, 36, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55,
56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72,
73, 74, 75, 76, 78, 79, 80, 81], dtype=int64)
The splitting function
For more details about the function split_with_groupes,
please check the utils package
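split_with_groupes is a custom helper from the utils package; scikit-learn's GroupShuffleSplit performs a comparable session-aware split, sketched below on a toy frame (an alternative illustration, not the helper actually used here):

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import GroupShuffleSplit

# Toy frame: 4 sessions of 5 rows each
demo = pd.DataFrame({"profile_id": np.repeat([1, 2, 3, 4], 5),
                     "pm": np.arange(20.0)})

# Hold out 25% of the *sessions* (here: 1 of 4) for the test set
gss = GroupShuffleSplit(n_splits=1, test_size=0.25, random_state=0)
train_idx, test_idx = next(gss.split(demo, groups=demo["profile_id"]))

train_ids = set(demo.iloc[train_idx]["profile_id"])
test_ids = set(demo.iloc[test_idx]["profile_id"])
print(train_ids.isdisjoint(test_ids))  # True: no session leaks across the split
```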
df_train,df_test=split_with_groupes(df=df2,groups_col='profile_id',test_size=0.2,seed=0,n_ether=10)
len(df_test)/len(df2),len(df_train)/len(df2)
(0.200810630470328, 0.799189369529672)
df_train2=df_train.reset_index(drop=True)
df_test2=df_test.reset_index(drop=True)
# Color plot
r='#b7190f'
b='#26619c'
# Create a new figure with two y-axes
fig, ax1 = plt.subplots(figsize=(18, 3))
ax2 = ax1.twinx()
# ax1 Pm
ax1.plot(df_train2.Pm,label='Pm',c=b)
ax1.set_xlabel('data rows')
ax1.set_ylabel('Pm (W)',c=b)
ax1.grid(which='major', color=b, linewidth=1,alpha=0.2)
ax1.tick_params(axis='y', colors=b)
# ax2 sessions
ax2.plot(df_train2.profile_id,label='profile_id',c=r)
ax2.set_ylabel('Profile_id (session)',c=r)
ax2.grid(which='major', color=r, linewidth=1,alpha=0.2)
ax2.tick_params(axis='y', colors=r)
plt.suptitle('Mechanical power vs session Id')
plt.legend()
plt.grid()
# Color plot
r='#b7190f'
b='#26619c'
# Create a new figure with two y-axes
fig, ax1 = plt.subplots(figsize=(18, 3))
ax2 = ax1.twinx()
# ax1 Pm
ax1.plot(df_test2.Pm,label='Pm',c=b)
ax1.set_xlabel('data rows')
ax1.set_ylabel('Pm (W)',c=b)
ax1.grid(which='major', color=b, linewidth=1,alpha=0.2)
ax1.tick_params(axis='y', colors=b)
# ax2 sessions
ax2.plot(df_test2.profile_id,label='profile_id',c=r)
ax2.set_ylabel('Profile_id (session)',c=r)
ax2.grid(which='major', color=r, linewidth=1,alpha=0.2)
ax2.tick_params(axis='y', colors=r)
plt.suptitle('Mechanical power vs session Id')
plt.legend()
plt.grid()
df2.columns
Index(['u_q', 'coolant', 'stator_winding', 'u_d', 'stator_tooth',
'motor_speed', 'i_d', 'i_q', 'pm', 'stator_yoke', 'ambient', 'torque',
'profile_id', 'us', 'is', 'Pm', 'Se'],
dtype='object')
colsx=['u_q', 'coolant', 'stator_winding', 'u_d', 'stator_tooth',
'motor_speed', 'i_d', 'i_q', 'stator_yoke', 'ambient', 'torque',
'profile_id','us', 'is', 'Se', 'Pm']
X_train=df_train[colsx]
y_train=df_train[ 'pm' ]
X_test=df_test[colsx]
y_test=df_test[ 'pm' ]
X_train.shape,X_test.shape,y_train.shape,y_test.shape
((1063574, 16), (267242, 16), (1063574,), (267242,))
Save the data
np.savez(r'../data/processed/No_normalized_data_V3.npz',
X_train=X_train.values,
y_train= y_train.values,
X_test=X_test.values,
y_test= y_test.values,
colsx=colsx,
colsy=['pm'])
We will start with the linear regression model as a simple and fast baseline for comparison.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
import joblib
import copy
from sklearn.inspection import permutation_importance
from utils import *
load=np.load(r'../data/processed/No_normalized_data_V3.npz')
keys=load.files
X_train, y_train, X_test, y_test= [load[k] for k in keys[:4]]
colsx=load['colsx']
X_train.shape , y_train.shape, X_test.shape, y_test.shape
((1063574, 16), (1063574,), (267242, 16), (267242,))
First_time=False
# If First_time is True, we initialize the model, fit it, and save it
# Otherwise, we just load the pretrained model
if First_time:
    reg = LinearRegression().fit(X_train, y_train)
    joblib.dump(reg, "../models/LinReg/LinReg_V0.joblib")
    reg2 = copy.copy(reg)
else:
    reg2 = joblib.load("../models/LinReg/LinReg_V0.joblib")
yh_train=reg2.predict(X_train)
yh_test=reg2.predict(X_test)
Train/Test set Metrics
print ('_'*20,'\n','Train set')
print(local_metrics(y_train, yh_train))
print ('_'*20,'\n','Test set')
print(local_metrics(y_test, yh_test))
____________________
Train set
{'r2_score': 0.8509728173767069, 'MSE': 52.41993356822764, 'RMSE': 7.240161156233171, 'NMSE': 0.013107461066154206}
____________________
Test set
{'r2_score': 0.8409277832842572, 'MSE': 52.09309754904043, 'RMSE': 7.217554817875679, 'NMSE': 0.017792183457083922}
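local_metrics comes from the utils package; the sketch below shows how such metrics could be reproduced with scikit-learn, under the assumption (inferred from the printed values) that NMSE is the MSE normalized by the mean of y²:

```python
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score

def metrics_sketch(y, yh):
    """Hypothetical re-implementation of utils.local_metrics
    (assumes NMSE = MSE / mean(y**2))."""
    mse = mean_squared_error(y, yh)
    return {"r2_score": r2_score(y, yh),
            "MSE": mse,
            "RMSE": np.sqrt(mse),
            "NMSE": mse / np.mean(np.square(y))}

y = np.array([20.0, 40.0, 60.0, 80.0])
yh = np.array([21.0, 39.0, 62.0, 79.0])
print(metrics_sketch(y, yh))  # MSE = 1.75, RMSE ~ 1.323
```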
Real vs prediction plot
plot_prediction_real(y_train,yh_train,label='train set',offset=10,\
fig_size=(5,5))
plot_prediction_real(y_test,yh_test,label='test set',fig_size=(5,5))
Time series
plot_y_yh_time(y_train,yh_train,title='Train set: time series, real vs prediction',\
fig_size=(10,4))
plot_y_yh_time(y_test,yh_test,title='Test set: time series, real vs prediction',\
fig_size=(10,4))
Error histogram
_=plot_error(y_train, yh_train, y_test, yh_test)
{'test': {'mean': -1.9822191989024145, 'std': 6.940021944961204}, 'train': {'mean': 4.4405223213114105e-14, 'std': 7.24016115623317}}
Plot the features importance
P_I=permutation_importance(estimator=reg, X=X_train, y=y_train,
scoring='r2', n_repeats=5, random_state=0)
means=P_I.importances_mean
std=P_I.importances_std
s=pd.Series(np.abs(means),index=colsx)
s=s.sort_values()
s.plot.bar(yerr=std)
plt.grid()
plt.title('Feature importances - LinearRegression')
plt.show()
We will use random search to tune the hyperparameters of the random forest model.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import RandomizedSearchCV
import joblib
from sklearn.pipeline import Pipeline
from sklearn.metrics import mean_squared_error as MSE
from sklearn.metrics import r2_score as R2
import matplotlib.pyplot as plt
from utils import *
load=np.load(r'../data/processed/No_normalized_data_V3.npz')
keys=load.files
X_train, y_train, X_test, y_test= [load[k] for k in keys[:4]]
colsx=load['colsx']
X_train.shape , y_train.shape, X_test.shape, y_test.shape
((1063574, 16), (1063574,), (267242, 16), (267242,))
Pipeline
model = Pipeline([
    ("model", RandomForestRegressor(criterion='squared_error', verbose=1,
                                    n_jobs=-1, random_state=0)),
])
Input shape
n_inp=X_train.shape[1]
print(n_inp)
16
Random Search parameters
param_distributions = {
"model__n_estimators": loguniform_int(100, 500), # default 100
"model__min_samples_split": uniform_int(1, 5), # default 2
"model__max_depth": loguniform_int(20, 150), # default None
"model__min_samples_leaf": loguniform_int(1, 5), # default 1
"model__max_features":uniform_int(5, n_inp) # default 1.0
}
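loguniform_int and uniform_int also come from the utils package; a plausible implementation (an assumption about their behaviour, not the actual helpers) wraps a SciPy distribution and exposes the rvs() interface that RandomizedSearchCV expects:

```python
import numpy as np
from scipy.stats import loguniform

class LogUniformInt:
    """Hypothetical integer log-uniform sampler (assumed behaviour of
    utils.loguniform_int); utils.uniform_int is presumably scipy.stats.randint."""
    def __init__(self, low, high):
        self._dist = loguniform(low, high)

    def rvs(self, size=None, random_state=None):
        # Sample log-uniformly in [low, high], then floor to integers
        return self._dist.rvs(size=size, random_state=random_state).astype(int)

samples = LogUniformInt(100, 500).rvs(size=1000, random_state=0)
print(samples.min() >= 100, samples.max() <= 500)  # True True
```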
model_random_search = RandomizedSearchCV(
    model,
    param_distributions=param_distributions,
    n_iter=15,  # number of different combinations to try
    cv=4,  # number of cross-validation folds: 4 means 25% for validation and 75% for training
    verbose=1,
)
Fit
First_time=False
# If First_time is True, we initialize the model, fit it, and save it
# Otherwise, we just load the pretrained model
if First_time:
    # fit the model
    model_random_search.fit(X_train, y_train)
    # Save the model
    joblib.dump(model_random_search, r'../models/RandForest/RandSear_RandFor_V1.pkl')
    model_random_search2 = model_random_search  # keep the same name as in the else branch
else:
    # Load the model
    model_random_search2 = joblib.load(r'../models/RandForest/RandSear_RandFor_V1.pkl')
# Make a beep to notify the end of model training (winsound is Windows-only)
import winsound
frequency = 2500  # Set Frequency To 2500 Hertz
duration = 2000   # Set Duration To 2000 ms == 2 seconds
winsound.Beep(frequency, duration)
The best estimator and the best parameters
'''Result
RandomForestRegressor(max_depth=68, max_features=5, min_samples_leaf=3,
                      min_samples_split=5, n_estimators=276, n_jobs=-1,
                      random_state=0, verbose=1)
'''
RF_best=model_random_search2.best_estimator_['model']
RF_best
RandomForestRegressor(max_depth=68, max_features=5, min_samples_leaf=3,
                      min_samples_split=5, n_estimators=276, n_jobs=-1,
                      random_state=0, verbose=1)
'''
Result:
{'model__max_depth': 68,
 'model__max_features': 5,
 'model__min_samples_leaf': 3,
 'model__min_samples_split': 5,
 'model__n_estimators': 276}
'''
model_random_search2.best_params_
{'model__max_depth': 68,
'model__max_features': 5,
'model__min_samples_leaf': 3,
'model__min_samples_split': 5,
'model__n_estimators': 276}
yh_train=RF_best.predict(X_train)
yh_test=RF_best.predict(X_test)
Train/Test set Metrics
print ('_'*20,'\n','Train set')
print(local_metrics(y_train, yh_train))
print ('_'*20,'\n','Test set')
print(local_metrics(y_test, yh_test))
____________________
Train set
{'r2_score': 0.9999009769749209, 'MSE': 0.03483109795810131, 'RMSE': 0.18663091372573118, 'NMSE': 8.709420811893875e-06}
____________________
Test set
{'r2_score': 0.8775015737002211, 'MSE': 40.1158832295746, 'RMSE': 6.333710068322878, 'NMSE': 0.013701415111505402}
Real vs prediction plot
y_train, yh_train
(array([25.89539337, 25.88982391, 25.88938141, ..., 62.13838736,
62.13342152, 62.13142866]),
array([25.74892074, 25.80154233, 25.81084665, ..., 62.05531916,
62.08200314, 61.92952095]))
plot_prediction_real(y_train,yh_train,label='train set',offset=10,\
fig_size=(5,5))
plot_prediction_real(y_test,yh_test,label='test set',fig_size=(5,5))
Time series
plot_y_yh_time(y_train,yh_train,title='Train set: time series, real vs prediction',\
fig_size=(10,4))
plot_y_yh_time(y_test,yh_test,title='Test set: time series, real vs prediction',\
fig_size=(10,4))
Error histogram
_=plot_error(y_train, yh_train, y_test, yh_test)
{'test': {'mean': -1.0235385796178915, 'std': 6.250460143510107}, 'train': {'mean': -0.00014658345299136245, 'std': 0.1866308561610128}}
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense,Input
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.callbacks import ModelCheckpoint
from tensorflow.keras.models import load_model
from sklearn.inspection import permutation_importance
import matplotlib.pyplot as plt
import json
from utils import *
load=np.load(r'../data/processed/No_normalized_data_V3.npz')
keys=load.files
X_train0, y_train0, X_test0, y_test0= [load[k] for k in keys[:4]]
colsx=load['colsx']
X_train0.shape , y_train0.shape, X_test0.shape, y_test0.shape
((1063574, 16), (1063574,), (267242, 16), (267242,))
The data was recorded at a sampling frequency of 2 Hz (sampling time of 0.5 s). Since we are trying to predict the rotor temperature, and the thermal inertia of the rotor is on the order of minutes, I propose to downsample the data from Ts = 0.5 s to Ts = 5 s to speed up the training of the DNN.
X_train=X_train0[::10]
X_test=X_test0[::10]
y_train=y_train0[::10]
y_test=y_test0[::10]
X_train.shape , y_train.shape, X_test.shape, y_test.shape
((106358, 16), (106358,), (26725, 16), (26725,))
# Input normalization
Mean=X_train.mean(axis=0)
Xn_train=X_train-Mean
STD=Xn_train.std(axis=0)
Xn_train= Xn_train / STD
Xn_test=X_test-Mean
Xn_test= Xn_test / STD
# check the normalization of the data: mean must be close to 0
np.abs(Xn_train.mean(axis=0)).max(), np.abs(Xn_test.mean(axis=0)).max()
(3.7625535341026386e-16, 0.3791241789891576)
# check the normalization of the data: STD must be close to 1
np.abs(Xn_train.std(axis=0)).max(), np.abs(Xn_test.std(axis=0)).max()
(1.0, 1.1727066598738511)
# Output normalization
# We will use tanh as the activation function of the output layer,
# so I propose to scale the output to the range [-0.9, 0.9].
# [-0.9, 0.9] avoids the extreme limits of tanh, [-1, 1]:
# at those limits the derivative is 0, so training is slow when
# the output is very close to -1/1
y_min=y_train.min()
y_max=y_train.max()
yn_train=1.8*((y_train-y_min)/(y_max-y_min)-0.5)
# check the normalization of the train set
yn_train.min(), yn_train.max()
(-0.9, 0.9)
yn_test=1.8*((y_test-y_min)/(y_max-y_min)-0.5)
# check the normalization of the test set
yn_test.min(), yn_test.max()
(-0.8963086086661727, 0.3961655575613971)
The max of yn_test is 0.396 instead of 0.9 because the max of y_test is 87 °C, while the max of y_train is 113 °C: see below.
y_test.min(),y_train.min(), y_test.max(),y_train.max()
(21.04705619812012, 20.856956481933597, 87.60704671625562, 113.55357360839844)
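For reference, the mapping to [-0.9, 0.9] and its inverse (used later to convert predictions back to °C) can be checked round-trip on a toy vector:

```python
import numpy as np

y = np.array([20.0, 60.0, 113.0])  # toy temperatures in °C
y_min, y_max = y.min(), y.max()

yn = 1.8 * ((y - y_min) / (y_max - y_min) - 0.5)      # forward: °C -> [-0.9, 0.9]
y_back = (yn / 1.8 + 0.5) * (y_max - y_min) + y_min   # inverse: back to °C

print(yn.min(), yn.max())      # -0.9 0.9
print(np.allclose(y, y_back))  # True
```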
# Create the model
model = Sequential()
model.add(Input(shape=(X_train.shape[1],)))
for i in range(7):
    model.add(Dense(12,activation='relu'))
model.add(Dense(1,activation='tanh'))
model.summary()
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense (Dense) (None, 12) 204
dense_1 (Dense) (None, 12) 156
dense_2 (Dense) (None, 12) 156
dense_3 (Dense) (None, 12) 156
dense_4 (Dense) (None, 12) 156
dense_5 (Dense) (None, 12) 156
dense_6 (Dense) (None, 12) 156
dense_7 (Dense) (None, 1) 13
=================================================================
Total params: 1,153
Trainable params: 1,153
Non-trainable params: 0
_________________________________________________________________
# This checkpoint will save an intermediate model every 10 epochs
checkpoint = ModelCheckpoint('model_DNN_{epoch:03d}.h5', period=10,verbose=1)
WARNING:tensorflow:`period` argument is deprecated. Please use `save_freq` to specify the frequency in number of batches seen.
# Compile the model using Adam optimizer
model.compile(optimizer=Adam(learning_rate=1e-3), loss='mse' )
# Train the model
history=model.fit(Xn_train, yn_train, epochs=20, batch_size=64,
callbacks=[checkpoint], validation_data=(Xn_test,yn_test))
Epoch 1/20  1662/1662 [==============================] - 3s 2ms/step - loss: 0.0250 - val_loss: 0.0296
Epoch 2/20  1662/1662 [==============================] - 3s 2ms/step - loss: 0.0164 - val_loss: 0.0221
Epoch 3/20  1662/1662 [==============================] - 2s 1ms/step - loss: 0.0144 - val_loss: 0.0192
Epoch 4/20  1662/1662 [==============================] - 2s 1ms/step - loss: 0.0133 - val_loss: 0.0166
Epoch 5/20  1662/1662 [==============================] - 3s 2ms/step - loss: 0.0126 - val_loss: 0.0194
Epoch 6/20  1662/1662 [==============================] - 2s 1ms/step - loss: 0.0122 - val_loss: 0.0204
Epoch 7/20  1662/1662 [==============================] - 2s 1ms/step - loss: 0.0118 - val_loss: 0.0189
Epoch 8/20  1662/1662 [==============================] - 2s 1ms/step - loss: 0.0114 - val_loss: 0.0170
Epoch 9/20  1662/1662 [==============================] - 2s 1ms/step - loss: 0.0111 - val_loss: 0.0164
Epoch 10/20 1662/1662 [==============================] - 2s 1ms/step - loss: 0.0108 - val_loss: 0.0159
Epoch 10: saving model to model_LSTM_010.h5
Epoch 11/20 1662/1662 [==============================] - 3s 2ms/step - loss: 0.0105 - val_loss: 0.0175
Epoch 12/20 1662/1662 [==============================] - 3s 2ms/step - loss: 0.0103 - val_loss: 0.0182
Epoch 13/20 1662/1662 [==============================] - 3s 2ms/step - loss: 0.0101 - val_loss: 0.0161
Epoch 14/20 1662/1662 [==============================] - 2s 1ms/step - loss: 0.0098 - val_loss: 0.0173
Epoch 15/20 1662/1662 [==============================] - 2s 1ms/step - loss: 0.0097 - val_loss: 0.0174
Epoch 16/20 1662/1662 [==============================] - 2s 1ms/step - loss: 0.0095 - val_loss: 0.0174
Epoch 17/20 1662/1662 [==============================] - 2s 1ms/step - loss: 0.0095 - val_loss: 0.0169
Epoch 18/20 1662/1662 [==============================] - 2s 1ms/step - loss: 0.0093 - val_loss: 0.0176
Epoch 19/20 1662/1662 [==============================] - 2s 1ms/step - loss: 0.0092 - val_loss: 0.0155
Epoch 20/20 1662/1662 [==============================] - 2s 1ms/step - loss: 0.0091 - val_loss: 0.0191
Epoch 20: saving model to model_LSTM_020.h5
First_time=False
# If First_time is True, we train the model, save it, and keep it
# Otherwise, we just load the pretrained model
if First_time:
    # Save the model
    model.save(r'../models/DNN/model_DNN_final.h5')
    history=history.history
    json.dump(history, open("../models/DNN/history.json", 'w'))
    model2 = model  # keep the same name as in the else branch
else:
    # Load the model
    model2 = load_model(r'../models/DNN/model_DNN_final.h5')
    history = json.load(open("../models/DNN/history.json"))
Make a beep to notify the end of model training
import winsound  # Windows-only
frequency = 2500  # Set Frequency To 2500 Hertz
duration = 2000   # Set Duration To 2000 ms == 2 seconds
winsound.Beep(frequency, duration)
Plot the train/test loss
plt.plot(history['loss'],label='train')
plt.plot(history['val_loss'],label='test')
plt.xlabel('epochs')
plt.ylabel('loss')
plt.title('Train/Test loss')
plt.legend()
plt.grid()
plt.show()
ynh_train=model2.predict(Xn_train)
ynh_test=model2.predict(Xn_test)
3324/3324 [==============================] - 3s 840us/step 836/836 [==============================] - 1s 882us/step
From the normalized output back to °C
yh_test=(ynh_test.flatten()/1.8+0.5)*(y_max-y_min)+y_min
yh_train=(ynh_train.flatten()/1.8+0.5)*(y_max-y_min)+y_min
Train/Test set Metrics
print ('_'*20,'\n','Train set')
print(local_metrics(y_train, yh_train))
print ('_'*20,'\n','Test set')
print(local_metrics(y_test, yh_test))
____________________
Train set
{'r2_score': 0.9319059301052521, 'MSE': 23.95176822640016, 'RMSE': 4.8940543750963945, 'NMSE': 0.005988927218422694}
____________________
Test set
{'r2_score': 0.8455059571217549, 'MSE': 50.596522986960906, 'RMSE': 7.113123293389544, 'NMSE': 0.01728099215889291}
Real vs prediction plot
y_train, yh_train
(array([25.89539337, 25.90344429, 25.95416451, ..., 62.27448128,
62.21632837, 62.14264575]),
array([28.247816, 28.550337, 28.454178, ..., 59.46094 , 57.67937 ,
57.303215], dtype=float32))
plot_prediction_real(y_train,yh_train,label='train set',offset=10,\
fig_size=(5,5))
plot_prediction_real(y_test,yh_test,label='test set',fig_size=(5,5))
Time series
plot_y_yh_time(y_train,yh_train,title='Train set: time series, real vs prediction',\
fig_size=(10,4))
plot_y_yh_time(y_test,yh_test,title='Test set: time series, real vs prediction',\
fig_size=(10,4))
Error histogram
_=plot_error(y_train, yh_train, y_test, yh_test)
{'test': {'mean': -1.7818939510828322, 'std': 6.886318097071564}, 'train': {'mean': -0.6292989497877967, 'std': 4.853426733576612}}
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense,Input,LSTM, Dropout
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.callbacks import ModelCheckpoint
from tensorflow.keras.models import load_model
from sklearn.inspection import permutation_importance
import matplotlib.pyplot as plt
import json
from utils import *
load=np.load(r'../data/processed/No_normalized_data_V3.npz')
keys=load.files
X_train0, y_train0, X_test0, y_test0= [load[k] for k in keys[:4]]
colsx=load['colsx']
X_train0.shape , y_train0.shape, X_test0.shape, y_test0.shape
((1063574, 16), (1063574,), (267242, 16), (267242,))
The data was recorded at a sampling frequency of 2 Hz (sampling time of 0.5 s). Since we are trying to predict the rotor temperature, and the thermal inertia of the rotor is on the order of minutes, I propose to downsample the data from Ts = 0.5 s to Ts = 5 s to speed up the training of the LSTM.
X_train=X_train0[::10]
X_test=X_test0[::10]
y_train=y_train0[::10]
y_test=y_test0[::10]
X_train.shape , y_train.shape, X_test.shape, y_test.shape
((106358, 16), (106358,), (26725, 16), (26725,))
# Input normalization
Mean=X_train.mean(axis=0)
Xn_train=X_train-Mean
STD=Xn_train.std(axis=0)
Xn_train= Xn_train / STD
Xn_test=X_test-Mean
Xn_test= Xn_test / STD
# check the normalization of the data: mean must be close to 0
np.abs(Xn_train.mean(axis=0)).max(), np.abs(Xn_test.mean(axis=0)).max()
(3.7625535341026386e-16, 0.3791241789891576)
# check the normalization of the data: STD must be close to 1
np.abs(Xn_train.std(axis=0)).max(), np.abs(Xn_test.std(axis=0)).max()
(1.0, 1.1727066598738511)
# Output normalization
# We will use tanh as the activation function of the output layer,
# so I propose to scale the output to the range [-0.9, 0.9].
# [-0.9, 0.9] avoids the extreme limits of tanh, [-1, 1]:
# at those limits the derivative is 0, so training is slow when
# the output is very close to -1/1
y_min=y_train.min()
y_max=y_train.max()
yn_train=1.8*((y_train-y_min)/(y_max-y_min)-0.5)
# check the normalization of the train set
yn_train.min(), yn_train.max()
(-0.9, 0.9)
yn_test=1.8*((y_test-y_min)/(y_max-y_min)-0.5)
# check the normalization of the test set
yn_test.min(), yn_test.max()
(-0.8963086086661727, 0.3961655575613971)
The max of yn_test is 0.396 instead of 0.9 because the max of y_test is 87 °C, while the max of y_train is 113 °C: see below.
y_test.min(),y_train.min(), y_test.max(),y_train.max()
(21.04705619812012, 20.856956481933597, 87.60704671625562, 113.55357360839844)
def recursive_array(arr, n_window):
    arr_out=[]
    for i in range(n_window, len(arr)):
        arr_out.append(arr[i-n_window:i])
    arr_out=np.array(arr_out)
    return arr_out
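recursive_array builds the overlapping windows with a Python loop; numpy's sliding_window_view (numpy ≥ 1.20) yields the same tensor without copying, which can be handy for large arrays. A sketch verifying the equivalence:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def recursive_array(arr, n_window):
    # loop-based version, as defined above
    return np.array([arr[i - n_window:i] for i in range(n_window, len(arr))])

arr = np.arange(24.0).reshape(8, 3)  # 8 time steps, 3 features
win = recursive_array(arr, n_window=4)

# Windows over the time axis; drop the last one (recursive_array stops one
# step early) and move the window axis next to the sample axis
sv = np.moveaxis(sliding_window_view(arr, window_shape=4, axis=0)[:-1], -1, 1)

print(win.shape)                # (4, 4, 3)
print(np.array_equal(win, sv))  # True
```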
n_window=50 # 50*5s = 250s
Xn3_test=recursive_array(Xn_test, n_window=n_window)
Xn3_train=recursive_array(Xn_train, n_window=n_window)
yn3_train=yn_train[n_window:]
yn3_test=yn_test[n_window:]
Xn_train.shape , yn_train.shape, Xn_test.shape, yn_test.shape
((106358, 16), (106358,), (26725, 16), (26725,))
Xn3_train.shape , yn3_train.shape, Xn3_test.shape, yn3_test.shape
((106308, 50, 16), (106308,), (26675, 50, 16), (26675,))
# Create the model
model = Sequential()
model.add(LSTM(16, return_sequences=True, input_shape=Xn3_train.shape[1:]))
model.add(Dropout(.2))
model.add(LSTM(16))
model.add(Dropout(.2))
model.add(Dense(16,activation='relu'))
model.add(Dense(1,activation='tanh'))
model.summary()
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
lstm (LSTM) (None, 50, 16) 2112
dropout (Dropout) (None, 50, 16) 0
lstm_1 (LSTM) (None, 16) 2112
dropout_1 (Dropout) (None, 16) 0
dense (Dense) (None, 16) 272
dense_1 (Dense) (None, 1) 17
=================================================================
Total params: 4,513
Trainable params: 4,513
Non-trainable params: 0
_________________________________________________________________
# This checkpoint will save an intermediate model every 10 epochs
checkpoint = ModelCheckpoint('model_LSTM_{epoch:03d}.h5', period=10,verbose=1)
WARNING:tensorflow:`period` argument is deprecated. Please use `save_freq` to specify the frequency in number of batches seen.
# Compile the model
model.compile(optimizer=Adam(learning_rate=1e-3), loss='mse' )
# Train the model
history=model.fit(Xn3_train, yn3_train, epochs=20, batch_size=64,
callbacks=[checkpoint], validation_data=(Xn3_test,yn3_test))
Epoch 1/20  1662/1662 [==============================] - 13s 7ms/step - loss: 0.0189 - val_loss: 0.0098
Epoch 2/20  1662/1662 [==============================] - 11s 6ms/step - loss: 0.0103 - val_loss: 0.0094
Epoch 3/20  1662/1662 [==============================] - 11s 7ms/step - loss: 0.0085 - val_loss: 0.0084
Epoch 4/20  1662/1662 [==============================] - 11s 6ms/step - loss: 0.0071 - val_loss: 0.0120
Epoch 5/20  1662/1662 [==============================] - 11s 6ms/step - loss: 0.0060 - val_loss: 0.0097
Epoch 6/20  1662/1662 [==============================] - 11s 6ms/step - loss: 0.0053 - val_loss: 0.0091
Epoch 7/20  1662/1662 [==============================] - 10s 6ms/step - loss: 0.0049 - val_loss: 0.0103
Epoch 8/20  1662/1662 [==============================] - 11s 6ms/step - loss: 0.0046 - val_loss: 0.0104
Epoch 9/20  1662/1662 [==============================] - 11s 6ms/step - loss: 0.0043 - val_loss: 0.0099
Epoch 10/20 1662/1662 [==============================] - 11s 6ms/step - loss: 0.0039 - val_loss: 0.0094
Epoch 00010: saving model to model_LSTM_010.h5
Epoch 11/20 1662/1662 [==============================] - 11s 6ms/step - loss: 0.0037 - val_loss: 0.0124
Epoch 12/20 1662/1662 [==============================] - 11s 6ms/step - loss: 0.0036 - val_loss: 0.0131
Epoch 13/20 1662/1662 [==============================] - 11s 6ms/step - loss: 0.0034 - val_loss: 0.0098
Epoch 14/20 1662/1662 [==============================] - 11s 6ms/step - loss: 0.0033 - val_loss: 0.0110
Epoch 15/20 1662/1662 [==============================] - 11s 7ms/step - loss: 0.0031 - val_loss: 0.0096
Epoch 16/20 1662/1662 [==============================] - 11s 6ms/step - loss: 0.0031 - val_loss: 0.0108
Epoch 17/20 1662/1662 [==============================] - 10s 6ms/step - loss: 0.0030 - val_loss: 0.0098
Epoch 18/20 1662/1662 [==============================] - 11s 6ms/step - loss: 0.0029 - val_loss: 0.0100
Epoch 19/20 1662/1662 [==============================] - 11s 6ms/step - loss: 0.0029 - val_loss: 0.0102
Epoch 20/20 1662/1662 [==============================] - 11s 6ms/step - loss: 0.0028 - val_loss: 0.0099
Epoch 00020: saving model to model_LSTM_020.h5
First_time=False
# If First_time is True, we train the model, save it, and keep it
# Otherwise, we just load the pretrained model
if First_time:
    # Save the model
    model.save(r'../models/LSTM/model_LSTM_final.h5')
    history=history.history
    json.dump(history, open("../models/LSTM/history.json", 'w'))
    model2 = model  # keep the same name as in the else branch
else:
    # Load the model
    model2 = load_model(r'../models/LSTM/model_LSTM_final.h5')
    history = json.load(open("../models/LSTM/history.json"))
Make a beep to notify the end of model training
import winsound
frequency = 2500 # Set Frequency To 2500 Hertz
duration = 2000  # Set Duration To 2000 ms == 2 seconds
winsound.Beep(frequency, duration)
Plot the training loss
fig = plt.figure(figsize=(6,3))
plt.plot(history['loss'],label='train')
plt.xlabel('epochs')
plt.ylabel('loss')
plt.title('Train loss')
plt.legend()
plt.grid()
plt.show()
ynh_train=model2.predict(Xn3_train)
ynh_test=model2.predict(Xn3_test)
3323/3323 [==============================] - 29s 8ms/step 834/834 [==============================] - 7s 8ms/step
Convert the normalized outputs back to °C
yh_test=(ynh_test.flatten()/1.8+0.5)*(y_max-y_min)+y_min
yh_train=(ynh_train.flatten()/1.8+0.5)*(y_max-y_min)+y_min
# Adapt the outputs to take the window into account
# so that inputs and outputs have the same size
y_train=y_train[n_window:]
y_test=y_test[n_window:]
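The de-normalization above is the exact inverse of the output normalization used for training. A minimal round-trip check of the two mappings, with hypothetical y_min/y_max values standing in for the train-set range:

```python
import numpy as np

# Hypothetical temperature range, mirroring y_min/y_max from the train set
y_min, y_max = 20.0, 113.0

def normalize(y):
    # Map [y_min, y_max] to [-0.9, 0.9] for the tanh output layer
    return 1.8 * ((y - y_min) / (y_max - y_min) - 0.5)

def denormalize(yn):
    # Inverse mapping: [-0.9, 0.9] back to °C
    return (yn / 1.8 + 0.5) * (y_max - y_min) + y_min

y = np.array([20.0, 60.0, 113.0])
yn = normalize(y)
print(yn)               # endpoints map to -0.9 and 0.9
print(denormalize(yn))  # recovers [20., 60., 113.]
```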
Train/Test set Metrics
print ('_'*20,'\n','Train set')
print(local_metrics(y_train, yh_train))
print ('_'*20,'\n','Test set')
print(local_metrics(y_test, yh_test))
____________________
Train set
{'r2_score': 0.9668431402709771, 'MSE': 11.651057054092657, 'RMSE': 3.41336447718269, 'NMSE': 0.0029121275239807}
____________________
Test set
{'r2_score': 0.9195336751782257, 'MSE': 26.34358586713694, 'RMSE': 5.13260030268644, 'NMSE': 0.008986687432743007}
Real vs prediction plot
y_train, yh_train
(array([28.7659359 , 28.75426102, 28.8516655 , ..., 62.27448128,
62.21632837, 62.14264575]),
array([29.180582, 29.36076 , 29.525795, ..., 66.45154 , 66.41196 ,
66.3602 ], dtype=float32))
plot_prediction_real(y_train,yh_train,label='train set',offset=10,\
fig_size=(5,5))
plot_prediction_real(y_test,yh_test,label='test set',fig_size=(5,5))
Time series
plot_y_yh_time(y_train,yh_train,title='Train set: time serie, real vs prediction',\
fig_size=(10,4))
plot_y_yh_time(y_test,yh_test,title='Test set: time serie, real vs prediction',\
fig_size=(10,4))
Error histogram
_=plot_error(y_train, yh_train, y_test, yh_test)
{'test': {'mean': -1.6058626449807745, 'std': 4.874914464130862}, 'train': {'mean': -1.2347944542939768, 'std': 3.1821910234518445}}
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense,Input,LSTM, Dropout
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.callbacks import ModelCheckpoint
from tensorflow.keras.models import load_model
from sklearn.inspection import permutation_importance
import matplotlib.pyplot as plt
import json
from utils import *
load=np.load(r'../data/processed/No_normalized_data_V3.npz')
keys=load.files
X_train0, y_train0, X_test0, y_test0= [load[k] for k in keys[:4]]
colsx=load['colsx']
X_train0.shape , y_train0.shape, X_test0.shape, y_test0.shape
((1063574, 16), (1063574,), (267242, 16), (267242,))
The data was recorded with a sampling frequency of 2 Hz (sampling time of 0.5 s). In this problem we try to predict the rotor temperature; since the thermal inertia of the rotor is on the order of minutes, I propose to downsample the data from Ts = 0.5 s to Ts = 5 s to speed up the training of the network.
X_train=X_train0[::10]
X_test=X_test0[::10]
y_train=y_train0[::10]
y_test=y_test0[::10]
X_train.shape , y_train.shape, X_test.shape, y_test.shape
((106358, 16), (106358,), (26725, 16), (26725,))
# Input normalization
Mean=X_train.mean(axis=0)
Xn_train=X_train-Mean
STD=Xn_train.std(axis=0)
Xn_train= Xn_train / STD
Xn_test=X_test-Mean
Xn_test= Xn_test / STD
# check the normalization of the data: mean must be close to 0
np.abs(Xn_train.mean(axis=0)).max(), np.abs(Xn_test.mean(axis=0)).max()
(3.7625535341026386e-16, 0.3791241789891576)
# check the normalization of the data: STD must be close to 1
np.abs(Xn_train.std(axis=0)).max(), np.abs(Xn_test.std(axis=0)).max()
(1.0, 1.1727066598738511)
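The checks above behave as expected by construction: since the mean and standard deviation are fitted on the train set only and then applied to both splits, the normalized train set is exactly centered while the test set is not. A sanity check of this scheme on synthetic data (the toy distributions below are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy train/test splits with slightly shifted distributions
X_tr = rng.normal(loc=2.0, scale=3.0, size=(1000, 4))
X_te = rng.normal(loc=2.5, scale=3.0, size=(200, 4))

# Fit the statistics on the train set only, then apply them to both splits
mean, std = X_tr.mean(axis=0), X_tr.std(axis=0)
Xn_tr = (X_tr - mean) / std
Xn_te = (X_te - mean) / std

print(np.abs(Xn_tr.mean(axis=0)).max())  # ~0 by construction
print(np.abs(Xn_te.mean(axis=0)).max())  # not 0: test statistics differ
```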
# Output normalization
# We will use tanh as the activation function of the output layer
# So I propose to center the output in the range [-0.9, 0.9]
# [-0.9, 0.9] avoids the extreme limits of tanh, [-1, 1]
# At those limits the derivative is 0, so training is slow when
# the output is very close to -1/1
y_min=y_train.min()
y_max=y_train.max()
yn_train=1.8*((y_train-y_min)/(y_max-y_min)-0.5)
# check the normalization of the train set
yn_train.min(), yn_train.max()
(-0.9, 0.9)
yn_test=1.8*((y_test-y_min)/(y_max-y_min)-0.5)
# check the normalization of the test set
yn_test.min(), yn_test.max()
(-0.8963086086661727, 0.3961655575613971)
The max of yn_test is 0.396 instead of 0.9 because the max of y_test is about 87 °C while the max of y_train is about 113 °C: see below
y_test.min(),y_train.min(), y_test.max(),y_train.max()
(21.04705619812012, 20.856956481933597, 87.60704671625562, 113.55357360839844)
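The choice of ±0.9 can be motivated by the tanh derivative, d tanh(x)/dx = 1 − tanh(x)²: near the saturation limits the gradient vanishes and learning stalls. A quick numerical check:

```python
def tanh_grad(y):
    # Gradient of tanh at the pre-activation whose output is y:
    # d/dx tanh(x) = 1 - tanh(x)^2
    return 1.0 - y ** 2

# Gradient at the chosen limit 0.9 vs. very close to saturation
print(tanh_grad(0.9))    # ~0.19 -> still a usable gradient
print(tanh_grad(0.999))  # ~0.002 -> learning nearly stalls
```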
def recursive_Xy_session(X, y, n_window, session):
    # Build sliding windows of length n_window; keep a window only if all
    # of its samples belong to the same measurement session (profile_id)
    X_out=[]
    y_out=[]
    for i in range(n_window, len(X)):
        if len(np.unique(session[i-n_window:i]))==1:
            X_out.append(X[i-n_window:i])
            y_out.append(y[i])
    X_out=np.array(X_out)
    y_out=np.array(y_out)
    return X_out,y_out
profile_id=np.argmax((colsx=='profile_id').astype(int))
print(set(X_train[:,profile_id]))
{2.0, 4.0, 5.0, 6.0, 7.0, 9.0, 11.0, 12.0, 14.0, 15.0, 16.0, 18.0, 19.0, 20.0, 21.0, 23.0, 24.0, 26.0, 27.0, 29.0, 30.0, 31.0, 32.0, 41.0, 42.0, 43.0, 44.0, 45.0, 46.0, 47.0, 48.0, 49.0, 51.0, 54.0, 55.0, 56.0, 57.0, 58.0, 59.0, 61.0, 62.0, 64.0, 65.0, 66.0, 67.0, 68.0, 69.0, 70.0, 71.0, 73.0, 74.0, 75.0, 78.0, 79.0, 81.0}
session_train=X_train[:,profile_id]
session_test=X_test[:,profile_id]
session_train
array([ 5., 5., 5., ..., 71., 71., 71.])
n_win=50 # 50*5s = 250s
Xn3_test,yn3_test =recursive_Xy_session(Xn_test,yn_test, n_window=n_win,session=session_test)
Xn3_train,yn3_train=recursive_Xy_session(Xn_train,yn_train, n_window=n_win,session=session_train)
Xn_train.shape , yn_train.shape, Xn_test.shape, yn_test.shape
((106358, 16), (106358,), (26725, 16), (26725,))
Xn3_train.shape , yn3_train.shape, Xn3_test.shape, yn3_test.shape
((103662, 50, 16), (103662,), (26038, 50, 16), (26038,))
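The shapes above are smaller than the raw inputs because windows that straddle a session boundary are dropped. A toy check of this session-aware windowing (the `make_windows` helper below is a hypothetical, self-contained version of the same logic):

```python
import numpy as np

def make_windows(X, y, n_window, session):
    # Sliding windows of length n_window; a window is kept only if all of
    # its samples come from the same measurement session
    X_out, y_out = [], []
    for i in range(n_window, len(X)):
        if len(np.unique(session[i - n_window:i])) == 1:
            X_out.append(X[i - n_window:i])
            y_out.append(y[i])
    return np.array(X_out), np.array(y_out)

# Two sessions of 5 samples each, window of 3
X = np.arange(10, dtype=float).reshape(-1, 1)
y = np.arange(10, dtype=float)
session = np.array([1, 1, 1, 1, 1, 2, 2, 2, 2, 2])

Xw, yw = make_windows(X, y, n_window=3, session=session)
# 7 candidate windows (i = 3..9); the 2 straddling the session boundary
# (i = 6, 7) are dropped, leaving 5
print(Xw.shape)  # (5, 3, 1)
```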
model = Sequential()
model.add(LSTM(16, return_sequences=True, input_shape=Xn3_train.shape[1:]))
model.add(Dropout(.2))
model.add(LSTM(16))
model.add(Dropout(.2))
model.add(Dense(16,activation='relu'))
model.add(Dense(1,activation='tanh'))
# This checkpoint saves an intermediate model every 10 epochs
checkpoint = ModelCheckpoint('model_LSTM_{epoch:03d}.h5', period=10,verbose=1)
WARNING:tensorflow:`period` argument is deprecated. Please use `save_freq` to specify the frequency in number of batches seen.
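The deprecation warning above suggests `save_freq`, which counts batches rather than epochs. A minimal sketch of the conversion (the sizes below match this notebook's training set; the commented `ModelCheckpoint` line is only indicative):

```python
# Hypothetical sizes taken from this notebook: ~103662 windows, batch_size=64
n_train, batch_size = 103662, 64
steps_per_epoch = -(-n_train // batch_size)  # ceil division -> batches per epoch

# `save_freq` counts batches, so "save every 10 epochs" becomes:
save_freq = 10 * steps_per_epoch
print(steps_per_epoch, save_freq)  # 1620 batches/epoch, checkpoint every 16200
# checkpoint = ModelCheckpoint('model_LSTM_{epoch:03d}.h5',
#                              save_freq=save_freq, verbose=1)
```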
# Compile the model
model.compile(optimizer=Adam(learning_rate=1e-3), loss='mse' )
# Train the model
history=model.fit(Xn3_train, yn3_train, epochs=20, batch_size=64,
callbacks=[checkpoint], validation_data=(Xn3_test,yn3_test))
Epoch 1/20 1620/1620 [==============================] - 47s 28ms/step - loss: 0.0163 - val_loss: 0.0104 Epoch 2/20 1620/1620 [==============================] - 48s 30ms/step - loss: 0.0088 - val_loss: 0.0100 Epoch 3/20 1620/1620 [==============================] - 45s 28ms/step - loss: 0.0069 - val_loss: 0.0124 Epoch 4/20 1620/1620 [==============================] - 46s 28ms/step - loss: 0.0059 - val_loss: 0.0129 Epoch 5/20 1620/1620 [==============================] - 45s 28ms/step - loss: 0.0052 - val_loss: 0.0121 Epoch 6/20 1620/1620 [==============================] - 45s 28ms/step - loss: 0.0047 - val_loss: 0.0117 Epoch 7/20 1620/1620 [==============================] - 45s 28ms/step - loss: 0.0044 - val_loss: 0.0114 Epoch 8/20 1620/1620 [==============================] - 45s 28ms/step - loss: 0.0040 - val_loss: 0.0131 Epoch 9/20 1620/1620 [==============================] - 46s 28ms/step - loss: 0.0037 - val_loss: 0.0122 Epoch 10/20 1620/1620 [==============================] - ETA: 0s - loss: 0.0035 Epoch 10: saving model to model_LSTM_010.h5 1620/1620 [==============================] - 46s 29ms/step - loss: 0.0035 - val_loss: 0.0129 Epoch 11/20 1620/1620 [==============================] - 46s 28ms/step - loss: 0.0034 - val_loss: 0.0107 Epoch 12/20 1620/1620 [==============================] - 45s 28ms/step - loss: 0.0033 - val_loss: 0.0104 Epoch 13/20 1620/1620 [==============================] - 48s 30ms/step - loss: 0.0031 - val_loss: 0.0100 Epoch 14/20 1620/1620 [==============================] - 50s 31ms/step - loss: 0.0030 - val_loss: 0.0099 Epoch 15/20 1620/1620 [==============================] - 45s 28ms/step - loss: 0.0030 - val_loss: 0.0114 Epoch 16/20 1620/1620 [==============================] - 46s 28ms/step - loss: 0.0028 - val_loss: 0.0121 Epoch 17/20 1620/1620 [==============================] - 51s 32ms/step - loss: 0.0027 - val_loss: 0.0159 Epoch 18/20 1620/1620 [==============================] - 47s 29ms/step - loss: 0.0026 - val_loss: 0.0137 Epoch 
19/20 1620/1620 [==============================] - 47s 29ms/step - loss: 0.0025 - val_loss: 0.0148 Epoch 20/20 1619/1620 [============================>.] - ETA: 0s - loss: 0.0024 Epoch 20: saving model to model_LSTM_020.h5 1620/1620 [==============================] - 46s 28ms/step - loss: 0.0024 - val_loss: 0.0188
First_time=False
# If First_time is True, we save the freshly trained model and its history
# Otherwise, we just load the pretrained model
if First_time:
    # Save the model
    model.save(r'../models/LSTM_session/model_LSTM_final.h5')
    history=history.history
    json.dump(history, open("../models/LSTM_session/history.json", 'w'))
else:
    # Load the model
    model2 = load_model(r'../models/LSTM_session/model_LSTM_final.h5')
    history = json.load(open("../models/LSTM_session/history.json", 'r'))
Make a beep to notify the end of model training
import winsound
frequency = 2500 # Set Frequency To 2500 Hertz
duration = 2000  # Set Duration To 2000 ms == 2 seconds
winsound.Beep(frequency, duration)
Plot the training loss
fig = plt.figure(figsize=(6,3))
plt.plot(history['loss'],label='train')
plt.xlabel('epochs')
plt.ylabel('loss')
plt.title('Train loss')
plt.legend()
plt.grid()
plt.show()
ynh_train=model2.predict(Xn3_train)
ynh_test=model2.predict(Xn3_test)
3240/3240 [==============================] - 22s 6ms/step 814/814 [==============================] - 6s 8ms/step
Convert the normalized outputs back to °C
yh_test=(ynh_test.flatten()/1.8+0.5)*(y_max-y_min)+y_min
yh_train=(ynh_train.flatten()/1.8+0.5)*(y_max-y_min)+y_min
# Adapt the outputs to take the window into account
# so that real and predicted values have the same size
y_train=(yn3_train/1.8+0.5)*(y_max-y_min)+y_min
y_test=(yn3_test/1.8+0.5)*(y_max-y_min)+y_min
Train/Test set Metrics
print ('_'*20,'\n','Train set')
print(local_metrics(y_train, yh_train))
print ('_'*20,'\n','Test set')
print(local_metrics(y_test, yh_test))
____________________
Train set
{'r2_score': 0.9670497054302775, 'MSE': 11.1632160396884, 'RMSE': 3.3411399311744487, 'NMSE': 0.0027476483309007817}
____________________
Test set
{'r2_score': 0.9211842386327109, 'MSE': 25.63767626638323, 'RMSE': 5.063366100370704, 'NMSE': 0.008630940925311264}
Real vs prediction plot
y_train, yh_train
(array([28.7659359 , 28.75426102, 28.8516655 , ..., 62.27448128,
62.21632837, 62.14264575]),
array([29.180582, 29.36076 , 29.525795, ..., 66.45154 , 66.41196 ,
66.3602 ], dtype=float32))
plot_prediction_real(y_train,yh_train,label='train set',offset=10,\
fig_size=(5,5))
plot_prediction_real(y_test,yh_test,label='test set',fig_size=(5,5))
Time series
plot_y_yh_time(y_train,yh_train,title='Train set: time serie, real vs prediction',\
fig_size=(10,4))
plot_y_yh_time(y_test,yh_test,title='Test set: time serie, real vs prediction',\
fig_size=(10,4))
Error histogram
_=plot_error(y_train, yh_train, y_test, yh_test)
{'test': {'mean': -1.5537948017744347, 'std': 4.8190660900595645}, 'train': {'mean': -1.1761755081762932, 'std': 3.1272715286099864}}
Import libraries
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
from utils import *
The metrics of each model
Organize the metrics of each model in a dictionary
dic={
'LinearRegression': {
'Train set':
{'r2_score': 0.8509728173767069, 'MSE': 52.41993356822764, 'RMSE': 7.240161156233171, 'NMSE': 0.013107461066154206}
,'Test set':
{'r2_score': 0.8409277832842572, 'MSE': 52.09309754904043, 'RMSE': 7.217554817875679, 'NMSE': 0.017792183457083922}
}
,'RandForest':{
'Train set':
{'r2_score': 0.9999009769749209, 'MSE': 0.03483109795810131, 'RMSE': 0.18663091372573118, 'NMSE': 8.709420811893875e-06}
,'Test set':
{'r2_score': 0.8775015737002211, 'MSE': 40.1158832295746, 'RMSE': 6.333710068322878, 'NMSE': 0.013701415111505402}
}
,'DNN': {
'Train set':
{'r2_score': 0.9319059301052521, 'MSE': 23.95176822640016, 'RMSE': 4.8940543750963945, 'NMSE': 0.005988927218422694}
,'Test set':
{'r2_score': 0.8455059571217549, 'MSE': 50.596522986960906, 'RMSE': 7.113123293389544, 'NMSE': 0.01728099215889291}
}
,'LSTM':{
'Train set':
{'r2_score': 0.9668431402709771, 'MSE': 11.651057054092657, 'RMSE': 3.41336447718269, 'NMSE': 0.0029121275239807}
,'Test set':
{'r2_score': 0.9195336751782257, 'MSE': 26.34358586713694, 'RMSE': 5.13260030268644, 'NMSE': 0.008986687432743007}
}
,'LSTM_session':{
'Train set':
{'r2_score': 0.9670497054302775, 'MSE': 11.1632160396884, 'RMSE': 3.3411399311744487, 'NMSE': 0.0027476483309007817}
,'Test set':
{'r2_score': 0.9211842386327109, 'MSE': 25.63767626638323, 'RMSE': 5.063366100370704, 'NMSE': 0.008630940925311264}
}
}
Prepare the DataFrame
df_comp = pd.DataFrame.from_dict({(i,j): dic[i][j]
for i in dic.keys()
for j in dic[i].keys()},
orient='index')
df_comp
| r2_score | MSE | RMSE | NMSE | ||
|---|---|---|---|---|---|
| LinearRegression | Train set | 0.850973 | 52.419934 | 7.240161 | 0.013107 |
| Test set | 0.840928 | 52.093098 | 7.217555 | 0.017792 | |
| RandForest | Train set | 0.999901 | 0.034831 | 0.186631 | 0.000009 |
| Test set | 0.877502 | 40.115883 | 6.333710 | 0.013701 | |
| DNN | Train set | 0.931906 | 23.951768 | 4.894054 | 0.005989 |
| Test set | 0.845506 | 50.596523 | 7.113123 | 0.017281 | |
| LSTM | Train set | 0.966843 | 11.651057 | 3.413364 | 0.002912 |
| Test set | 0.919534 | 26.343586 | 5.132600 | 0.008987 | |
| LSTM_session | Train set | 0.967050 | 11.163216 | 3.341140 | 0.002748 |
| Test set | 0.921184 | 25.637676 | 5.063366 | 0.008631 |
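The tuple-key trick used above generalizes: `pd.DataFrame.from_dict` with `orient='index'` turns `(model, split)` tuple keys into a two-level MultiIndex, which is what enables the `df.loc[(slice(None), ...)]` selections. A minimal example with made-up scores:

```python
import pandas as pd

# Tuple keys (model, split) become a two-level MultiIndex
dic = {
    'ModelA': {'Train set': {'r2_score': 0.95}, 'Test set': {'r2_score': 0.90}},
    'ModelB': {'Train set': {'r2_score': 0.99}, 'Test set': {'r2_score': 0.88}},
}
df = pd.DataFrame.from_dict(
    {(m, s): dic[m][s] for m in dic for s in dic[m]}, orient='index'
)
print(df)
# Select by index level, e.g. all 'Test set' rows:
print(df.loc[(slice(None), 'Test set'), :])
```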
Only Train set
df_comp_train=df_comp.loc[(slice(None), 'Train set'), :]
df_comp_train
| r2_score | MSE | RMSE | NMSE | ||
|---|---|---|---|---|---|
| LinearRegression | Train set | 0.850973 | 52.419934 | 7.240161 | 0.013107 |
| RandForest | Train set | 0.999901 | 0.034831 | 0.186631 | 0.000009 |
| DNN | Train set | 0.931906 | 23.951768 | 4.894054 | 0.005989 |
| LSTM | Train set | 0.966843 | 11.651057 | 3.413364 | 0.002912 |
| LSTM_session | Train set | 0.967050 | 11.163216 | 3.341140 | 0.002748 |
Only Test set
df_comp_test=df_comp.loc[(slice(None), 'Test set'), :]
df_comp_test
| r2_score | MSE | RMSE | NMSE | ||
|---|---|---|---|---|---|
| LinearRegression | Test set | 0.840928 | 52.093098 | 7.217555 | 0.017792 |
| RandForest | Test set | 0.877502 | 40.115883 | 6.333710 | 0.013701 |
| DNN | Test set | 0.845506 | 50.596523 | 7.113123 | 0.017281 |
| LSTM | Test set | 0.919534 | 26.343586 | 5.132600 | 0.008987 |
| LSTM_session | Test set | 0.921184 | 25.637676 | 5.063366 | 0.008631 |
Metric plots
dfplot=df_comp['r2_score'].to_frame().reset_index(names=['models','train_test'])
dfplot
| models | train_test | r2_score | |
|---|---|---|---|
| 0 | LinearRegression | Train set | 0.850973 |
| 1 | LinearRegression | Test set | 0.840928 |
| 2 | RandForest | Train set | 0.999901 |
| 3 | RandForest | Test set | 0.877502 |
| 4 | DNN | Train set | 0.931906 |
| 5 | DNN | Test set | 0.845506 |
| 6 | LSTM | Train set | 0.966843 |
| 7 | LSTM | Test set | 0.919534 |
| 8 | LSTM_session | Train set | 0.967050 |
| 9 | LSTM_session | Test set | 0.921184 |
# Create sns bar plot
plt.figure(figsize=(8, 3))
sns.barplot(x='models', y='r2_score', hue='train_test', data=dfplot,width=0.4)
plt.legend( loc='lower right')
plt.grid()
plt.ylim([0.65,1])
plt.title("Models comparison, R2-score")
plt.show()
dfplot=df_comp['NMSE'].to_frame().reset_index(names=['models','train_test'])
# Create sns bar plot
plt.figure(figsize=(8, 3))
sns.barplot(x='models', y='NMSE', hue='train_test', data=dfplot,width=0.4)
plt.legend( loc='upper right')
plt.grid()
#plt.ylim([0.65,1])
plt.title("Models comparison, Normalized MSE")
plt.show()
dfplot=df_comp['MSE'].to_frame().reset_index(names=['models','train_test'])
# Create sns bar plot
plt.figure(figsize=(8, 3))
sns.barplot(x='models', y='MSE', hue='train_test', data=dfplot,width=0.4)
plt.legend( loc='upper right')
plt.grid()
#plt.ylim([0.65,1])
plt.title("Models comparison, MSE")
plt.show()
Organize the error mean/std of each model in a dictionary
dic={
'LinearRegression':
{'test': {'mean': -1.9822191989024145, 'std': 6.940021944961204}, 'train': {'mean': 4.4405223213114105e-14, 'std': 7.24016115623317}}
,'RandForest':
{'test': {'mean': -1.0235385796178915, 'std': 6.250460143510107}, 'train': {'mean': -0.00014658345299136245, 'std': 0.1866308561610128}}
,'DNN':
{'test': {'mean': -1.7818939510828322, 'std': 6.886318097071564}, 'train': {'mean': -0.6292989497877967, 'std': 4.853426733576612}}
,'LSTM':
{'test': {'mean': -1.6058626449807745, 'std': 4.874914464130862}, 'train': {'mean': -1.2347944542939768, 'std': 3.1821910234518445}}
,'LSTM_session':
{'test': {'mean': -1.5537948017744347, 'std': 4.8190660900595645}, 'train': {'mean': -1.1761755081762932, 'std': 3.1272715286099864}}
}
Prepare the DataFrame
# dict => df
df_comp = pd.DataFrame.from_dict({(i,j): dic[i][j]
for i in dic.keys()
for j in dic[i].keys()},
orient='index')
# Compute 3*std of each model
df_comp['_3_sdt']=df_comp['std']*3
df_comp
| mean | std | _3_sdt | ||
|---|---|---|---|---|
| LinearRegression | test | -1.982219e+00 | 6.940022 | 20.820066 |
| train | 4.440522e-14 | 7.240161 | 21.720483 | |
| RandForest | test | -1.023539e+00 | 6.250460 | 18.751380 |
| train | -1.465835e-04 | 0.186631 | 0.559893 | |
| DNN | test | -1.781894e+00 | 6.886318 | 20.658954 |
| train | -6.292989e-01 | 4.853427 | 14.560280 | |
| LSTM | test | -1.605863e+00 | 4.874914 | 14.624743 |
| train | -1.234794e+00 | 3.182191 | 9.546573 | |
| LSTM_session | test | -1.553795e+00 | 4.819066 | 14.457198 |
| train | -1.176176e+00 | 3.127272 | 9.381815 |
Why 3 STD of the error?
for n_std in range(1,7):
    x=np.linspace(-n_std,n_std,10000)
    pdf=norm_pdf(x, mean=0, std=1)
    print('> the population covered by -', n_std, 'and +', n_std, 'STD is',
          '{:.7f}'.format(100*np.trapz(y=pdf, x=x)), '%')
> the population covered by - 1 and + 1 STD is 68.2689491 % > the population covered by - 2 and + 2 STD is 95.4499733 % > the population covered by - 3 and + 3 STD is 99.7300203 % > the population covered by - 4 and + 4 STD is 99.9936657 % > the population covered by - 5 and + 5 STD is 99.9999427 % > the population covered by - 6 and + 6 STD is 99.9999998 %
+/- 3 STD covers 99.73% of the population
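These coverages also have a closed form: for a standard normal, the mass within ±n STD is erf(n/√2), which matches the numerical integration above.

```python
import math

for n_std in range(1, 4):
    # Closed-form coverage of N(0,1) within +/- n_std standard deviations
    coverage = math.erf(n_std / math.sqrt(2))
    print(f'+/- {n_std} STD covers {100 * coverage:.4f} %')
```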
Only Test set
df_comp_test=df_comp.loc[(slice(None), 'test'), :]
df_comp_test
| mean | std | _3_sdt | ||
|---|---|---|---|---|
| LinearRegression | test | -1.982219 | 6.940022 | 20.820066 |
| RandForest | test | -1.023539 | 6.250460 | 18.751380 |
| DNN | test | -1.781894 | 6.886318 | 20.658954 |
| LSTM | test | -1.605863 | 4.874914 | 14.624743 |
| LSTM_session | test | -1.553795 | 4.819066 | 14.457198 |
dfplot=df_comp_test['_3_sdt'].to_frame().reset_index(names=['models','train_test'])
dfplot
| models | train_test | _3_sdt | |
|---|---|---|---|
| 0 | LinearRegression | test | 20.820066 |
| 1 | RandForest | test | 18.751380 |
| 2 | DNN | test | 20.658954 |
| 3 | LSTM | test | 14.624743 |
| 4 | LSTM_session | test | 14.457198 |
Plot the 3 STD error of each model
dfplot=df_comp['_3_sdt'].to_frame().reset_index(names=['models','train_test'])
dfplot.head()
| models | train_test | _3_sdt | |
|---|---|---|---|
| 0 | LinearRegression | test | 20.820066 |
| 1 | LinearRegression | train | 21.720483 |
| 2 | RandForest | test | 18.751380 |
| 3 | RandForest | train | 0.559893 |
| 4 | DNN | test | 20.658954 |
# Create sns bar plot
plt.figure(figsize=(8, 3))
sns.barplot(x='models', y='_3_sdt', hue='train_test', data=dfplot,width=0.2)
plt.legend( loc='upper right')
plt.grid()
plt.ylabel('maximum prediction error @ +3 STD (°C)')
plt.title("Models comparison, +/-3 STD error (covers 99.73% of the cases)")
plt.show()
The LSTM model is the best estimator; the difference between the plain LSTM and LSTM_session is very small.